Skip to content

fix: close Pebble batches on all paths, correct lock scan upper bound, unexport TxnInternalKeyPrefix#433

Closed
Copilot wants to merge 14 commits intofeature/refactorfrom
copilot/sub-pr-429
Closed

fix: close Pebble batches on all paths, correct lock scan upper bound, unexport TxnInternalKeyPrefix#433
Copilot wants to merge 14 commits intofeature/refactorfrom
copilot/sub-pr-429

Conversation

Copy link
Copy Markdown
Contributor

Copilot AI commented Mar 25, 2026

Addresses resource leaks and a correctness bug flagged in review of the previous Pebble/MVCC hardening PR.

Pebble batch lifecycle fixes (store/lsm_store.go)

Pebble batches must be Close()d after Commit() to release memory; several paths omitted this:

  • findMaxCommitTS: replaced per-path b.Close() with defer b.Close() — previously leaked on the success path
  • compactFlushBatch: capture old batch before replacing, call Close() after successful commit
  • Compact: close batch after final commit when batchCount > 0
  • restoreMVCCEntries / restoreGobEntries: close each committed batch before rotation, and close the final batch on both success and error paths

Lock resolver scan bound (kv/lock_resolver.go)

append(txnLockKey(nil), 0xFF) is not a valid exclusive upper bound — user keys containing 0xFF bytes sort after it, causing those locks to be silently skipped. Replaced with prefixScanEnd(txnLockPrefixBytes), which increments the last non-0xFF byte of the prefix:

// Before — incorrect when user keys contain 0xFF
lockEnd := append(txnLockKey(nil), txnNamespaceSentinel)

// After — correct exclusive upper bound for any user key content
lockEnd := prefixScanEnd(txnLockPrefixBytes)

Unexport TxnInternalKeyPrefix (store/store.go)

TxnInternalKeyPrefix was an exported mutable []byte, allowing callers outside the package to modify it at runtime. Renamed to txnInternalKeyPrefix; all usages are within store/ (lsm_store.go, mvcc_store.go, lsm_store_txn_test.go).


📱 Kick off Copilot coding agent tasks wherever you are with GitHub Mobile, available on iOS and Android.

bootjp and others added 13 commits March 25, 2026 04:49
- pebbleStore: make saveLastCommitTS durable (Sync) and atomic within
  ApplyMutations WriteBatch to prevent timestamp rollback on crash
- pebbleStore: hold mutex across conflict check and batch commit to
  eliminate TOCTOU race in ApplyMutations
- pebbleStore: use bounded iterators in GetAt/LatestCommitTS instead of
  unbounded NewIter(nil) to reduce Pebble overhead
- pebbleStore: switch PutAt/DeleteAt/ExpireAt to NoSync since Raft log
  already provides durability for FSM-driven writes
- mvccStore: replace tree.Each() with Iterator loop in ScanAt for early
  termination on limit, avoiding full treemap traversal
- fsm: add prefix byte to Raft command encoding to eliminate double
  protobuf deserialization on the hot path (backward compatible)
- txn_keys: pre-allocate []byte prefix slices at package level and add
  common prefix fast-path check to eliminate per-call heap allocations
- sharded_coordinator: log errors in abortPreparedTxn instead of
  silently discarding them
- coordinator: add 5s timeout to redirect gRPC forward calls
- shard_store: add 5s timeout to all proxy gRPC calls
- adapter/internal: add VerifyLeader quorum check to Forward handler
- docs: add comprehensive review_todo.md with remaining issues
…paction

- pebbleStore: implement Compact() with batched MVCC garbage collection
  that iterates keys, keeps one version <= minTS per user key, and
  deletes older versions in batches of 1000
- pebbleStore: implement RetentionController interface (MinRetainedTS /
  SetMinRetainedTS) so FSMCompactor no longer skips pebbleStore
- mvccStore: split Compact into 2-phase approach — scan under RLock to
  collect candidates, then apply updates in batched Lock/Unlock cycles
  (batch size 500) to avoid blocking all reads/writes during compaction
- docs: update review_todo.md marking completed items

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ding

- store: add TxnInternalKeyPrefix constant for compaction filtering
- mvccStore/pebbleStore: skip keys with !txn| prefix during Compact()
  to prevent breaking lock resolution by pruning commit/rollback records
- txn_codec: replace bytes.Buffer with direct slice encoding in
  EncodeTxnMeta, encodeTxnLock, encodeTxnIntent to eliminate per-call
  heap allocations
- docs: update review_todo.md marking 1.6 and 3.8 as done, downgrade 1.2

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…nsafe

- Fix meta key namespace collision (4.1): prefix with \x00, add backward
  compat migration in findMaxCommitTS, isMetaKey helper for scan/compact
- Add background LockResolver (1.4/2.3): periodically scans expired locks
  on leader nodes and resolves orphaned commit/abort locks
- Add forward retry with limit (2.2): forwardWithRetry re-fetches leader
  address on each failure, bounds retries to 3
- Add decodeKeyUnsafe (3.9): zero-copy key decode for temporary comparisons
  in hot iteration paths (GetAt, ExistsAt, LatestCommitTS, Compact)
- Document Snapshot Isolation (4.2): clarify write-skew limitation in
  handleOnePhaseTxnRequest and MVCCStore.ApplyMutations
- Document cross-shard scan limitation (4.5) and VerifyLeader TOCTOU (4.3)
- Mark DynamoDB ConditionCheck (4.4) as already addressed

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…Dispatch

- FSM Abort path (5.1): 10 tests covering Prepare→Abort flow, commit
  rejection, lock/intent cleanup, rollback record, edge cases
- PebbleStore transactions (5.2): 14 tests covering ApplyMutations
  conflict detection, atomic batch, TTL, LatestCommitTS, Compact with
  meta/txn key preservation
- Coordinate.Dispatch (5.3): 6 tests covering raw put/del, one-phase
  txn, validation, and automatic startTS assignment

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- ShardedCoordinator abort (5.4): test verifying that when Shard2 Prepare
  fails, Shard1 receives correct Abort and locks are cleaned up
- Concurrent access (5.6): 7 new race-detected tests covering concurrent
  PutAt, GetAt+PutAt, ApplyMutations, ScanAt+PutAt, and scan snapshot
  consistency for MVCCStore

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…cated prefix bytes

Co-authored-by: bootjp <1306365+bootjp@users.noreply.github.com>
Agent-Logs-Url: https://github.com/bootjp/elastickv/sessions/8c736c29-0405-4197-93a2-8f290254f51d
…rdinator_abort_test.go, and lsm_store.go

Co-authored-by: bootjp <1306365+bootjp@users.noreply.github.com>
Agent-Logs-Url: https://github.com/bootjp/elastickv/sessions/aed61b49-9084-425e-bd0e-899be4c0bf9d
…ction

- kv/fsm.go: extract decodeLegacyRaftRequest from decodeRaftRequests default case
- cmd/server/demo.go: extract setupFSMStore from run function
- store/lsm_store.go: extract compactEntryState/shouldDeleteEntry/compactFlushBatch
  for Compact; extract restoreMVCCEntries for restoreFromStreamingMVCC;
  extract restoreGobEntries for restoreFromLegacyGob; add hash import
- store/mvcc_store.go: lift compactEntry to package scope; extract
  computeScanCapHint/collectScanResults for ScanAt; extract
  compactPhase1/compactPhase2 for Compact
- store/mvcc_store_concurrency_test.go: extract goroutine bodies into
  concurrentGetAtWriter/concurrentGetAtReader, scanAtWriter/scanAtScanner,
  snapshotConsistencyWriter/snapshotConsistencyScanner helpers
- kv/lock_resolver.go: fix pre-existing undefined ok/userKey variables by
  using txnUserKeyFromLockKey; apply gofmt alignment

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…etypeassert, wrapcheck, mnd, gosec, unused)

Co-authored-by: bootjp <1306365+bootjp@users.noreply.github.com>
Agent-Logs-Url: https://github.com/bootjp/elastickv/sessions/aed61b49-9084-425e-bd0e-899be4c0bf9d
fix: correct Pebble iterator bounds and use pre-allocated txn prefix bytes
…unds, unexport TxnInternalKeyPrefix

Co-authored-by: bootjp <1306365+bootjp@users.noreply.github.com>
Agent-Logs-Url: https://github.com/bootjp/elastickv/sessions/ec0ea35a-7814-4c4c-b6c2-1669a33a05b2
Copilot AI changed the title [WIP] Fix: Address critical data safety, consistency, and performance issues fix: close Pebble batches on all paths, correct lock scan upper bound, unexport TxnInternalKeyPrefix Mar 25, 2026
Copilot AI requested a review from bootjp March 25, 2026 16:21
@bootjp bootjp closed this Mar 25, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants